Abstract

We study the nonparametric estimation of the jump density of a compound
Poisson process from the discrete observation of one trajectory over
$[0,T]$. We consider the microscopic regime when the sampling rate
$\Delta = \Delta_T \to 0$ as $T \to \infty$. We propose an adaptive wavelet
threshold density estimator and study its performance for the $L_p$ loss,
$p \ge 1$, over Besov spaces. The main novelty is that we achieve minimax
rates of convergence for sampling rates $\Delta_T$ that vanish with $T$ at
arbitrary polynomial rates. More precisely, our estimator attains minimax
rates of convergence provided there exists a constant $K \ge 1$ such that
the sampling rate $\Delta_T$ satisfies $T\Delta_T^{2K+2} \le 1$. If this
condition cannot be satisfied we still provide an upper bound for our
estimator. The estimating procedure is based on the inversion of the
compounding operator in the same spirit as Buchmann and Grübel (2003).
1 Introduction

1.1 Statistical setting

Let $R$ be a standard homogeneous Poisson process with intensity $\vartheta$
in $(0,\infty)$. We define the compound Poisson process $X$ as

\[ X_t = \sum_{i=1}^{R_t} \xi_i, \quad t \ge 0, \]

where the $(\xi_i)$ are independent and identically distributed random
variables, independent of the Poisson process $R$.

Assume that we have discrete observations of the process $X$ over $[0,T]$ at
times $i\Delta$ for some $\Delta > 0$:

\[ (X_\Delta, \ldots, X_{\lfloor T\Delta^{-1} \rfloor \Delta}). \tag{1} \]

We focus on the microscopic regime, namely

\[ \Delta = \Delta_T \to 0 \quad \text{as } T \to \infty, \]

and work under the following assumption. 

Assumption 1. The law of the $\xi_i$ is absolutely continuous with respect to
the Lebesgue measure, with density $f$.

We denote by $\mathcal{F}(\mathbb{R})$ the space of densities with respect to
the Lebesgue measure supported by $\mathbb{R}$. We investigate the
nonparametric estimation of the density $f$ on a compact interval
$\mathcal{D}$ included in $\mathbb{R}$ from the observations (1). To that end
we use wavelet threshold density estimators and study their rate of
convergence uniformly over Besov balls for the following loss function

\[ \big( \mathbb{E}\big[ \|\hat f - f\|^p_{L_p(\mathcal{D})} \big] \big)^{1/p}, \tag{2} \]

where $\hat f$ is an estimator of $f$, $p \ge 1$ and

\[ \|f\|_{L_p(\mathcal{D})} = \Big( \int_{\mathcal{D}} |f(x)|^p \, dx \Big)^{1/p}. \]

We also denote by $\|f\|_{L_p(\mathbb{R})}$ the usual $L_p$ norm for $p \ge 1$

\[ \|f\|_{L_p(\mathbb{R})} = \Big( \int_{\mathbb{R}} |f(x)|^p \, dx \Big)^{1/p}. \]

We do not assume the intensity $\vartheta$ to be known: it is a nuisance
parameter.

By Assumption 1, on the event $\{X_{i\Delta} - X_{(i-1)\Delta} = 0\}$ no jump
occurred between $(i-1)\Delta$ and $i\Delta$, and the increment
$X_{i\Delta} - X_{(i-1)\Delta}$ gives no information on $f$. In the
microscopic regime many increments are zero; therefore to estimate $f$ we
focus on the nonzero increments and denote by $N_T$ their number over
$[0,T]$. In that statistical context two difficulties arise. First, the
sample size $N_T$ is random. Second, on the event
$\{X_{i\Delta} - X_{(i-1)\Delta} \ne 0\}$ the increment
$X_{i\Delta} - X_{(i-1)\Delta}$ is not necessarily a realisation of the
density $f$. Indeed, even if $\Delta$ is small there is always a positive
probability that more than one jump occurred between $(i-1)\Delta$ and
$i\Delta$. Conditional on $\{X_{i\Delta} - X_{(i-1)\Delta} \ne 0\}$, the law
of $X_{i\Delta} - X_{(i-1)\Delta}$ has density given by (see Proposition 1 in
Section 2 below)

\[ \mathbf{P}_\Delta[f](x) = \sum_{m=1}^{\infty} \mathbb{P}(R_\Delta = m \mid R_\Delta \ne 0) \, f^{*m}(x), \quad \text{for } x \in \mathbb{R}, \tag{3} \]

where $*$ is the convolution product and $f^{*m} = f * \cdots * f$, $m$ times.

Adaptive estimators of the density $f$ in that statistical context already
exist. Under the condition $T\Delta_T \le 1$ (or $T\Delta_T^2 \le 1$ if $f$ is
smooth enough), they attain minimax rates of convergence over Sobolev spaces
for the $L_2$ loss (see Bec and Lacour [1], Comte and Genon-Catalot [4, 6]
and Figueroa-Lopez [10]). In this paper we try to answer the following
questions.

i) Is it possible to construct an estimator of $f$ when $\Delta_T$ decays
slowly to 0, for instance when $\Delta_T$ vanishes polynomially slowly with
$T$?

ii) Is it possible to construct adaptive wavelet estimators that attain, over
Besov spaces for the $L_p$ loss defined in (2), the classical minimax rates
of convergence of the experiment where we observe $T$ independent
realisations of $f$?

Without loss of generality, assume that $T$ is an integer. If we observe $T$
independent realisations of a density $f$ of regularity $s$ measured with the
$L_\pi$ norm, $\pi > 0$, it is possible to achieve the minimax rate of
convergence for the $L_p$ loss, which is, up to constants and logarithmic
factors, of the form

\[ T^{-\alpha(s,\pi,p)}, \]

where $\alpha(s,\pi,p) \le 1/2$ (see for instance Donoho et al. [7] and (16)
hereafter). When the process $X$ is continuously observed over $[0,T]$, we
have $R_T$ independent and identically distributed realisations of $f$.
Moreover, for $T$ large enough, $R_T$ is of the order of $T$. That is why we
want to compare the performance of estimators of $f$ in the regime
$\Delta_T \to 0$ with the classical minimax rate we would have if $X$ were
continuously observed.

1.2 Our Results 

We build our estimator of $f$ using equation (3) and proceed in two steps. The
first step is the computation of the inverse of the operator
$f \mapsto \mathbf{P}_\Delta[f]$. The inverse takes the form

\[ \mathbf{P}_\Delta^{-1}[\nu] = \sum_{m=1}^{\infty} a_m(\vartheta, \Delta) \, \nu^{*m}, \quad \nu \in \mathcal{F}(\mathbb{R}), \]

where the $(a_m(\vartheta, \Delta_T))$ are explicit (see Lemma 1 below). They
depend on the intensity $\vartheta$ and can be estimated. We take advantage of

\[ f \approx \mathbf{L}_{\Delta,K}\big[\mathbf{P}_\Delta[f]\big], \tag{4} \]

where $\mathbf{L}_{\Delta,K}$ is the Taylor expansion of order $K$ in $\Delta$
of $\mathbf{P}_\Delta^{-1}$. It depends only on
$(\mathbf{P}_\Delta[f]^{*m}, m = 1, \ldots, K+1)$. That step can be referred
to as decompounding, as introduced in Buchmann et al. [2].

The second step consists in estimating the densities
$\mathbf{P}_\Delta[f]^{*m}$, for $m = 1, \ldots, K+1$. For that we use the
$N_T$ nonzero increments, which are independent with density
$\mathbf{P}_\Delta[f]$. The difficulty here is that $N_T$ is random. In
Theorem 1 we show that, conditional on $N_T$, wavelet threshold estimators of
$\mathbf{P}_\Delta[f]^{*m}$ attain a rate of convergence, up to logarithmic
factors, in $N_T^{-\alpha(s,\pi,p)}$. For $T$ large enough we prove (see
Proposition 2 in Section 5) that $N_T$ concentrates around a deterministic
value of the order of $T$, giving an unconditional rate of convergence in
$T^{-\alpha(s,\pi,p)}$. We inject those estimators into
$\mathbf{L}_{\Delta,K}$, defined in (4), and obtain an estimator of $f$ that
we call estimator corrected at order $K$.

The study of the rate of convergence of the estimator corrected at order $K$
requires the control of two distinct error terms: a deterministic one due to
the first step, which is the error made when approximating $f$ by
$\mathbf{L}_{\Delta,K}[\mathbf{P}_\Delta[f]]$ in (4), and a statistical one
due to the replacement of the $\mathbf{P}_\Delta[f]^{*m}$ by estimators in the
second step. The deterministic error decreases when $K$ increases, so the
idea is to choose $K$ sufficiently large for the deterministic error term to
be negligible compared with the statistical one. We give in Theorem 1 an
upper bound for the rate of convergence of the estimator corrected at order
$K$ which is, up to constants and logarithmic factors,

\[ \max\{ T^{-\alpha(s,\pi,p)}, \Delta_T^{K+1} \}. \]

It decreases with $K$, and if there exists $K_0$ such that

\[ T \Delta_T^{2K_0+2} \le 1, \tag{5} \]

then, since $\alpha(s,\pi,p) \le 1/2$, the estimator corrected at order $K_0$
attains the minimax rates of convergence. It follows that for every
$\Delta_T$ polynomially decreasing with $T$, it is possible to exhibit $K_0$
such that (5) is valid, and the estimator corrected at order $K_0$ provides a
positive answer to i) and ii). If no $K$ satisfies condition (5), Theorem 1
provides an upper bound for the rate of convergence of the estimator
corrected at order $K$; in that case the estimator still provides a positive
answer to i).

In the case of a compound Poisson process, the results of the present paper
generalise to some extent those of Bec and Lacour [1], Comte and
Genon-Catalot [4, 6] and Figueroa-Lopez [10]. This is discussed in further
detail in Section 4. In Section 2 we give the main results of the paper. We
define the wavelet functions and Besov spaces used for the estimation before
giving a complete construction of the estimator corrected at order $K$. Then
we give an upper bound for its rate of convergence for the $L_p$ loss defined
in (2), $p \ge 1$, uniformly over Besov balls. A numerical example
illustrates the behaviour of the estimator corrected at order $K$ in Section
3. Finally, Section 5 is dedicated to the proofs.



The model of this paper is central in many application fields, e.g.
statistical physics (see Moharir [17]), biology (see Huelsenbeck et al. [13]),
financial series or mathematical insurance (see Scalas [19]). It is well
adapted to the study of phenomena where random independent events occur at
random times. For instance, in insurance failure theory these events can
model the claims that insurance companies have to pay to the subscribers. The
insurer's surplus at a given time $t$ can be modelled by the following process

\[ K(t) = K_0 + kt - X_t, \]

where $K_0$ is the capital of the company at time 0, the second term is a
deterministic trend corresponding to the average income received from the
subscribers and $X$ is a compound Poisson process modelling the insurance
claims occurring at random times with random amounts of money at stake. This
is the Cramér-Lundberg model; see Embrechts et al. [8] or Scalas [19].
Compound Poisson processes can also model the changes of an asset price in
finance; see Masoliver et al. [15].



2 Main results 

2.1 Besov spaces and wavelet thresholding 

To estimate the densities $(\mathbf{P}_\Delta[f]^{*m}, m = 1, \ldots, K+1)$ we
use wavelet threshold density estimators and study their performance
uniformly over Besov balls. In this paragraph we reproduce some classical
results on Besov spaces, wavelet bases and wavelet threshold estimators (see
Cohen [3], Donoho et al. [7] or Kerkyacharian and Picard [14]) that we use in
the next sections.

Wavelets and Besov spaces

We describe the smoothness of a function with Besov spaces on $\mathcal{D}$.
We recall here some well documented results on Besov spaces and their
connection to wavelet bases (see Cohen [3], Donoho et al. [7] or
Kerkyacharian and Picard [14]). Let $(\psi_\lambda)_\lambda$ be a regular
wavelet basis adapted to the domain $\mathcal{D}$. The multi-index $\lambda$
concatenates the spatial index and the resolution level $j = |\lambda|$. Set
$\Lambda_j := \{\lambda, |\lambda| = j\}$ and
$\Lambda = \cup_{j \ge -1} \Lambda_j$; for $f$ in $L_p(\mathbb{R})$ we have

\[ f = \sum_{j \ge -1} \sum_{\lambda \in \Lambda_j} \langle f, \psi_\lambda \rangle \, \psi_\lambda, \tag{6} \]

where $j = -1$ incorporates the low frequency part of the decomposition and
$\langle \cdot, \cdot \rangle$ denotes the usual $L_2$ inner product. We
define Besov spaces in terms of wavelet coefficients: for $s > 0$ and
$\pi \in (0, \infty]$ a function $f$ belongs to the Besov space
$\mathcal{B}^s_{\pi\infty}(\mathcal{D})$ if the norm

\[ \|f\|_{\mathcal{B}^s_{\pi\infty}(\mathcal{D})} := \sup_{j \ge -1} 2^{j(s + 1/2 - 1/\pi)} \Big( \sum_{\lambda \in \Lambda_j} |\langle f, \psi_\lambda \rangle|^\pi \Big)^{1/\pi} \tag{7} \]

is finite, with usual modifications if $\pi = \infty$.

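We need additional properties on the wavelet basis $(\psi_\lambda)_\lambda$,
which are listed in the following assumption.

Assumption 2. For $p \ge 1$,

• We have for some $\mathfrak{C} \ge 1$

\[ \mathfrak{C}^{-1} 2^{|\lambda|(p/2-1)} \le \|\psi_\lambda\|^p_{L_p(\mathcal{D})} \le \mathfrak{C} \, 2^{|\lambda|(p/2-1)}. \]

• For some $\mathfrak{C} > 0$, $\sigma > 0$ and for all $s \le \sigma$,
$J \ge 0$, we have

\[ \Big\| f - \sum_{j \le J} \sum_{\lambda \in \Lambda_j} \langle f, \psi_\lambda \rangle \psi_\lambda \Big\|_{L_p(\mathcal{D})} \le \mathfrak{C} \, 2^{-Js} \|f\|_{\mathcal{B}^s_{p\infty}(\mathcal{D})}. \tag{8} \]

• If $p > 1$, for some $\mathfrak{C} \ge 1$ and for any sequence of
coefficients $(u_\lambda)_{\lambda \in \Lambda}$

\[ \mathfrak{C}^{-1} \Big\| \sum_{\lambda \in \Lambda} u_\lambda \psi_\lambda \Big\|_{L_p(\mathcal{D})} \le \Big\| \Big( \sum_{\lambda \in \Lambda} |u_\lambda \psi_\lambda|^2 \Big)^{1/2} \Big\|_{L_p(\mathcal{D})} \le \mathfrak{C} \Big\| \sum_{\lambda \in \Lambda} u_\lambda \psi_\lambda \Big\|_{L_p(\mathcal{D})}. \tag{9} \]

• For any subset $\Lambda_0 \subset \Lambda$ and for some $\mathfrak{C} \ge 1$

\[ \mathfrak{C}^{-1} \sum_{\lambda \in \Lambda_0} \|\psi_\lambda\|^p_{L_p(\mathcal{D})} \le \int_{\mathcal{D}} \Big( \sum_{\lambda \in \Lambda_0} |\psi_\lambda(x)|^2 \Big)^{p/2} dx \le \mathfrak{C} \sum_{\lambda \in \Lambda_0} \|\psi_\lambda\|^p_{L_p(\mathcal{D})}. \tag{10} \]

Property (8) ensures that definition (7) of Besov spaces matches the
definition in terms of linear approximation. Property (9) ensures that
$(\psi_\lambda)_\lambda$ is an unconditional basis of $L_p$ and (10) is a
super-concentration inequality (see Kerkyacharian and Picard [14] p. 304 and
p. 306).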

Wavelet threshold estimator 

Let $(\phi, \psi)$ be a pair of scaling function and mother wavelet that
generate a basis $(\psi_\lambda)_\lambda$ satisfying Assumption 2 for some
$\sigma > 0$. We rewrite (6) as

\[ f = \sum_{k \in \Lambda_0} \alpha_{0k} \phi_{0k} + \sum_{j \ge 1} \sum_{k \in \Lambda_j} \beta_{jk} \psi_{jk}, \]

where $\phi_{0k}(\cdot) = \phi(\cdot - k)$ and
$\psi_{jk}(\cdot) = 2^{j/2} \psi(2^j \cdot - k)$ and

\[ \alpha_{0k} = \int \phi_{0k}(x) f(x) \, dx, \qquad \beta_{jk} = \int \psi_{jk}(x) f(x) \, dx. \]

For every $j \ge 0$, the set $\Lambda_j$ has cardinality $2^j$ and
incorporates boundary terms that we choose not to distinguish in the notation
for simplicity. An estimator of a function $f$ is obtained by replacing the
$(\alpha_{0k})$ and $(\beta_{jk})$ by estimated values. In the sequel we use
$(\gamma_{jk})$ to designate either $(\alpha_{0k})$ or $(\beta_{jk})$ and
$(g_{jk})$ for the wavelet functions $(\phi_{0k})$ or $(\psi_{jk})$.



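We consider classical hard threshold estimators of the form

\[ \hat f(\cdot) = \sum_{k \in \Lambda_0} \hat\alpha_{0k} \phi_{0k}(\cdot) + \sum_{j=1}^{J} \sum_{k \in \Lambda_j} \hat\beta_{jk} \mathbf{1}_{\{|\hat\beta_{jk}| \ge \eta\}} \psi_{jk}(\cdot), \]

where $\hat\alpha_{0k}$ and $\hat\beta_{jk}$ are estimators of $\alpha_{0k}$
and $\beta_{jk}$, and $J$ and $\eta$ are respectively the resolution level and
the threshold, possibly depending on the data. Thus to construct $\hat f$ we
have to specify estimators $(\hat\gamma_{jk})$ of the $(\gamma_{jk})$ and the
coefficients $J$ and $\eta$.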
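The two ingredients of a hard threshold estimator, empirical wavelet coefficients and the keep-or-kill rule, can be sketched as follows. This is only an illustration: the Haar wavelet is used because it has a closed form, although it is not regular enough to satisfy Assumption 2, and the function names `haar_psi`, `empirical_coefficient` and `hard_threshold` are hypothetical.

```python
import math

def haar_psi(x):
    """Haar mother wavelet on [0, 1) (illustrative stand-in for psi)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def empirical_coefficient(sample, j, k):
    """Empirical mean of psi_jk(x) = 2^(j/2) * psi(2^j x - k) over the sample."""
    scale = 2.0 ** j
    return sum(math.sqrt(scale) * haar_psi(scale * x - k) for x in sample) / len(sample)

def hard_threshold(coefficient, eta):
    """Keep the coefficient only if its magnitude reaches the threshold eta."""
    return coefficient if abs(coefficient) >= eta else 0.0
```

A coefficient whose empirical value is below the threshold is set to zero and does not contribute to the reconstruction.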



2.2 Construction of the estimator 

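Assume that we have $\lfloor T\Delta^{-1} \rfloor$ discrete data at times
$i\Delta$ for some $\Delta > 0$ of the process $X$

\[ (X_\Delta, \ldots, X_{\lfloor T\Delta^{-1} \rfloor \Delta}). \]

Introduce the increments

\[ \mathbf{D}^\Delta X_i = X_{i\Delta} - X_{(i-1)\Delta}, \quad \text{for } i = 1, \ldots, \lfloor T\Delta^{-1} \rfloor, \]

where $X_0 = 0$. They are independent and identically distributed since $X$ is
a compound Poisson process. Define

\[ S_1 = \inf\{ j, \ \mathbf{D}^\Delta X_j \ne 0 \} \wedge T, \]
\[ S_i = \inf\{ j > S_{i-1}, \ \mathbf{D}^\Delta X_j \ne 0 \} \wedge T \quad \text{for } i > 1, \]

where $S_i$ is the random index of the $i$th nonzero increment, and

\[ N_T = \sum_{i=1}^{\lfloor T\Delta^{-1} \rfloor} \mathbf{1}_{\{\mathbf{D}^\Delta X_i \ne 0\}} \]

the random number of nonzero increments observed over $[0,T]$. By Assumption
1, on the event $\{\mathbf{D}^\Delta X_i = 0\}$, no jump occurred between
$(i-1)\Delta$ and $i\Delta$. In the microscopic regime, when
$\Delta = \Delta_T \to 0$ as $T$ goes to infinity, many increments are null
and convey no information on $f$; hence for the estimation of $f$ we focus on
the nonzero ones

\[ (\mathbf{D}^\Delta X_{S_1}, \ldots, \mathbf{D}^\Delta X_{S_{N_T}}). \]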
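To make the sampling scheme concrete, the sketch below simulates the increments of a compound Poisson process on a regular grid and keeps the nonzero ones. It is an illustration under stated assumptions: the names `poisson` and `nonzero_increments` are hypothetical, the jump sampler `draw_jump` is supplied by the caller, and the Poisson counts are drawn with Knuth's multiplication method, which is adequate only for small means such as $\vartheta\Delta$ in the microscopic regime.

```python
import math
import random

def poisson(rng, mean):
    """Draw a Poisson variate by Knuth's multiplication method (small means)."""
    limit, count, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        count += 1
        prod *= rng.random()
    return count

def nonzero_increments(T, delta, intensity, draw_jump, seed=0):
    """Return the number of sampling intervals and the nonzero increments."""
    rng = random.Random(seed)
    n = int(math.floor(T / delta + 1e-9))  # floor(T / delta), guarded against rounding
    kept = []
    for _ in range(n):
        jumps = poisson(rng, intensity * delta)     # jumps in one interval
        increment = sum(draw_jump(rng) for _ in range(jumps))
        if increment != 0.0:                        # zero increments carry no information
            kept.append(increment)
    return n, kept
```

For instance, `nonzero_increments(200.0, 0.1, 1.0, lambda rng: rng.gauss(0.0, 1.0))` returns the 2000 sampling intervals together with the observed nonzero increments, whose count plays the role of $N_T$.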

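Proposition 1. The distribution of the increment $\mathbf{D}^\Delta X_{S_1}$
has density with respect to the Lebesgue measure given by

\[ \mathbf{P}_\Delta[f] = \sum_{m=1}^{\infty} p_m(\Delta) \, f^{*m}, \]

where

\[ p_m(\Delta) = \mathbb{P}(R_\Delta = m \mid R_\Delta \ne 0) = \frac{1}{e^{\vartheta\Delta} - 1} \frac{(\vartheta\Delta)^m}{m!}. \]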

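Let $\Delta_0$ be such that

\[ \sum_{m=2}^{\infty} \frac{(\vartheta\Delta_0)^{m-2}}{m!} \le 1. \]

For $\Delta \le \Delta_0$, we have

\[ 1 - \vartheta\Delta \le p_1(\Delta) \le 1. \]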
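The weights of Proposition 1 are easy to evaluate numerically. The following sketch (the function name `jump_count_weights` is hypothetical) computes $p_1(\Delta), \ldots, p_{m_{\max}}(\Delta)$ from the zero-truncated Poisson formula; it is meant as a sanity check, not as part of the estimation procedure.

```python
import math

def jump_count_weights(intensity, delta, m_max):
    """Weights p_m = (theta*delta)^m / (m! * (exp(theta*delta) - 1)), m = 1..m_max."""
    x = intensity * delta
    return [x ** m / (math.factorial(m) * math.expm1(x)) for m in range(1, m_max + 1)]
```

With $\vartheta = 1$ and $\Delta = 0.1$, the setting of the numerical example in Section 3, the first three weights are approximately 0.951, 0.048 and 0.002, so about 5% of the nonzero increments contain more than one jump.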

It is straightforward to verify that the nonlinear operator
$\mathbf{P}_\Delta$ is a mapping from $\mathcal{F}(\mathbb{R})$ to itself. The
observations $(\mathbf{D}^\Delta X_{S_i})$ are realisations of the density
$\mathbf{P}_\Delta[f]$ and, by Proposition 1, the weight $p_1(\Delta) \to 1$
in the limit $\Delta = \Delta_T \to 0$. It follows that for $\Delta_T$ small
enough most of the $(\mathbf{D}^\Delta X_{S_i})$ have distribution $f$. A
naive method to estimate $f$ is then to apply classical density estimators to
the $(\mathbf{D}^\Delta X_{S_i})$. That estimator requires a convergence
condition on $\Delta_T$ to achieve the minimax rate of convergence (see
Theorem 1). However, we wish to construct an estimator that attains minimax
rates of convergence under weaker conditions on $\Delta_T$.

We adopt the estimating strategy of Section 1.2 and construct an
approximation of $f$.

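Lemma 1. The inverse $\mathbf{P}_\Delta^{-1}$ of $\mathbf{P}_\Delta$, such
that for all densities $f$ in $\mathcal{F}(\mathbb{R})$ if
$\mathbf{P}_\Delta[f] = \nu$ we have $\mathbf{P}_\Delta^{-1}[\nu] = f$, is
given by

\[ \mathbf{P}_\Delta^{-1}[\nu] = \frac{1}{\vartheta\Delta} \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m} \big( e^{\vartheta\Delta} - 1 \big)^m \nu^{*m}. \]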

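To build the estimator corrected at order $K$ we use that
$\mathbf{P}_\Delta^{-1}$ is a power series whose coefficients are equivalent
to increasing powers of $\Delta$. Then $\mathbf{L}_{\Delta,K}$, the Taylor
expansion of order $K$ in $\Delta$ of $\mathbf{P}_\Delta^{-1}$, is obtained by
keeping the first $K+1$ terms of the inverse

\[ \mathbf{L}_{\Delta,K}[\nu] = \sum_{m=1}^{K+1} \frac{(-1)^{m+1}}{m \vartheta\Delta} \big( e^{\vartheta\Delta} - 1 \big)^m \nu^{*m}, \quad \nu \in \mathcal{F}(\mathbb{R}). \tag{11} \]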
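The effect of the truncation in (11) can be checked on characteristic functions, where convolution powers become ordinary powers. If the characteristic function of $f$ takes the value $z$ at some frequency, that of $\mathbf{P}_\Delta[f]$ takes the value $(e^{\vartheta\Delta z} - 1)/(e^{\vartheta\Delta} - 1)$, and applying the truncated series should return $z$ up to an error that shrinks as $K$ grows. The sketch below is a numerical illustration under this reading; the function names and the test value $z = 0.3 + 0.4i$ are arbitrary choices.

```python
import cmath
import math

def inverse_coefficients(intensity, delta, order):
    """Coefficients of nu^{*m}, m = 1..order+1, in the truncated inverse (11)."""
    x = intensity * delta
    return [(-1) ** (m + 1) * math.expm1(x) ** m / (m * x)
            for m in range(1, order + 2)]

def compound_cf(z, intensity, delta):
    """Characteristic-function value of P_Delta[f] when that of f equals z."""
    x = intensity * delta
    return (cmath.exp(x * z) - 1.0) / math.expm1(x)

def corrected_cf(w, intensity, delta, order):
    """Apply the order-K truncation to a characteristic-function value w."""
    coeffs = inverse_coefficients(intensity, delta, order)
    return sum(a * w ** m for m, a in enumerate(coeffs, start=1))

z = 0.3 + 0.4j
w = compound_cf(z, 1.0, 0.1)
errors = [abs(corrected_cf(w, 1.0, 0.1, K) - z) for K in range(4)]
```

The list `errors` holds the approximation error for $K = 0, 1, 2, 3$; each additional correction gains roughly one power of $\vartheta\Delta$.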

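Next we construct wavelet threshold density estimators of the first $K+1$
convolution powers of $\mathbf{P}_\Delta[f]$ that will be plugged in (11).
Define

\[ \hat\gamma^{(m)}_{jk} = \frac{1}{N_{T,m}} \sum_{i=1}^{N_{T,m}} g_{jk}\big( \mathbf{D}^\Delta_m X_{S_i} \big), \quad m \ge 1, \tag{12} \]

where $N_{T,m} = \lfloor N_T/m \rfloor \ge 1$ for large enough $T$ and

\[ \mathbf{D}^\Delta_m X_{S_i} = \mathbf{D}^\Delta X_{S_i} + \mathbf{D}^\Delta X_{S_{N_{T,m}+i}} + \cdots + \mathbf{D}^\Delta X_{S_{(m-1)N_{T,m}+i}}. \]

The $(\mathbf{D}^\Delta X_{S_i})$ are independent and identically distributed
with density $\mathbf{P}_\Delta[f]$; thus the
$(\mathbf{D}^\Delta_m X_{S_i})$ are independent and identically distributed
with density $\mathbf{P}_\Delta[f]^{*m}$. Let $\eta > 0$ and
$J \in \mathbb{N} \setminus \{0\}$; define $\hat P_{\Delta,m}$, the estimator
of $\mathbf{P}_\Delta[f]^{*m}$ over $\mathcal{D}$, as

\[ \hat P_{\Delta,m}(x) = \sum_{k} \hat\alpha^{(m)}_{0k} \phi_{0k}(x) + \sum_{j=1}^{J} \sum_{k} \hat\beta^{(m)}_{jk} \mathbf{1}_{\{|\hat\beta^{(m)}_{jk}| \ge \eta\}} \psi_{jk}(x). \tag{13} \]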
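The construction of the $m$-fold sums entering (12) amounts to cutting the nonzero increments into $m$ consecutive blocks of length $N_{T,m} = \lfloor N_T/m \rfloor$ and adding them position by position. A minimal sketch (the name `block_sums` is hypothetical) reads:

```python
def block_sums(increments, m):
    """Sum m disjoint blocks of length floor(len/m) position by position.

    Element i of the result is x[i] + x[n_m + i] + ... + x[(m-1)*n_m + i],
    so each observation is used at most once and the sums stay independent.
    """
    n_m = len(increments) // m
    return [sum(increments[block * n_m + i] for block in range(m))
            for i in range(n_m)]
```

For example, `block_sums([1, 2, 3, 4, 5, 6, 7], 2)` returns `[5, 7, 9]`: the last observation is discarded and only three sums are available, which is why the sample size for the $m$th convolution power is $N_{T,m}$ rather than $N_T$.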

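Definition 1. We define $\hat f^{K}_{T,\Delta}$, the estimator corrected at
order $K$, for $K$ in $\mathbb{N}$ and $x$ in $\mathcal{D}$ as

\[ \hat f^{K}_{T,\Delta}(x) = \sum_{m=1}^{K+1} \frac{(-1)^{m+1}}{m \hat\vartheta_T \Delta} \big( e^{\hat\vartheta_T \Delta} - 1 \big)^m \hat P_{\Delta,m}(x), \tag{14} \]

where

\[ \hat\vartheta_T = -\frac{1}{\Delta} \log(1 - \hat p_T) \tag{15} \]

and

\[ \hat p_T = \frac{N_T}{\lfloor T\Delta^{-1} \rfloor} \]

is the empirical estimator of
$p(\Delta) = \mathbb{P}(R_\Delta \ne 0) = 1 - e^{-\vartheta\Delta}$.

Lemma 1 justifies the form of the estimator corrected at order $K$.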
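The intensity estimator (15) only needs the count of nonzero increments. A short sketch, with the hypothetical name `estimate_intensity`, is given below; the guard against $\hat p_T = 1$ is an added safety check, not part of the definition.

```python
import math

def estimate_intensity(n_nonzero, n_increments, delta):
    """Invert p = 1 - exp(-theta * delta) at the empirical frequency of nonzero increments."""
    p_hat = n_nonzero / n_increments
    if not 0.0 <= p_hat < 1.0:
        raise ValueError("the empirical frequency must lie in [0, 1)")
    return -math.log1p(-p_hat) / delta
```

When the empirical frequency equals $1 - e^{-\vartheta\Delta}$ exactly, the function returns $\vartheta$.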

2.3 Convergence rates 

We estimate densities $f$ which verify a smoothness property in terms of Besov
balls

\[ \mathcal{F}(s,\pi,\mathfrak{M}) = \{ f \in \mathcal{F}(\mathbb{R}), \ \|f\|_{\mathcal{B}^s_{\pi\infty}(\mathcal{D})} \le \mathfrak{M} \}, \]

where $\mathfrak{M}$ is a positive constant. We are interested in estimating
$f$ on the compact interval $\mathcal{D}$, which is why we only impose that
its restriction to $\mathcal{D}$ belongs to a Besov ball.
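Theorem 1. We work under Assumptions 1 and 2. Let $\sigma > s > 1/\pi$,
$p \ge 1 \wedge \pi$ and $\hat P_{\Delta_T,m}$ be the threshold wavelet
estimator of $\mathbf{P}_{\Delta_T}[f]^{*m}$ on $\mathcal{D}$ constructed from
$(\phi,\psi)$ and defined in (13). Take $J$ such that

\[ 2^J N_T^{-1} \log(N_T^{1/2}) \le 1, \]

and

\[ \eta = \kappa N_T^{-1/2} \sqrt{\log(N_T^{1/2})}, \]

for some $\kappa > 0$. Let

\[ \alpha(s,p,\pi) = \min\Big\{ \frac{s}{2s+1}, \ \frac{s + 1/p - 1/\pi}{2(s + 1/2 - 1/\pi)} \Big\}. \tag{16} \]

1) The estimator $\hat P_{\Delta_T,m}$ verifies for large enough $T$ and
sufficiently large $\kappa > 0$

\[ \sup_{f \in \mathcal{F}(s,\pi,\mathfrak{M})} \Big( \mathbb{E}\big[ \|\hat P_{\Delta_T,m} - \mathbf{P}_{\Delta_T}[f]^{*m}\|^p_{L_p(\mathcal{D})} \,\big|\, N_T \big] \Big)^{1/p} \le \mathfrak{C} N_T^{-\alpha(s,p,\pi)}, \]

up to logarithmic factors in $T$ and where $\mathfrak{C}$ depends on
$s, \pi, p, \mathfrak{M}, \phi, \psi$.

2) The estimator corrected at order $K$, $\hat f^{K}_{T,\Delta_T}$, defined in
(14) verifies for $T$ large enough and any positive constants
$\underline{\vartheta}$ and $\overline{\vartheta}$

\[ \sup_{\vartheta \in [\underline{\vartheta}, \overline{\vartheta}]} \ \sup_{f \in \mathcal{F}(s,\pi,\mathfrak{M})} \Big( \mathbb{E}\big[ \|\hat f^{K}_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})} \big] \Big)^{1/p} \le \mathfrak{C} \max\{ T^{-\alpha(s,p,\pi)}, \Delta_T^{K+1} \}, \]

up to logarithmic factors in $T$ and where $\mathfrak{C}$ depends on
$s, \pi, p, \mathfrak{M}, \phi, \psi, \underline{\vartheta},
\overline{\vartheta}, K$.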
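The exponent (16) and the resulting bound of part 2) are simple to tabulate. The sketch below (function names `minimax_exponent` and `rate_bound` are hypothetical) ignores constants and logarithmic factors, so its output only indicates orders of magnitude.

```python
def minimax_exponent(s, p, pi):
    """Exponent alpha(s, p, pi) of (16): the smaller of the dense and sparse regimes."""
    dense = s / (2.0 * s + 1.0)
    sparse = (s + 1.0 / p - 1.0 / pi) / (2.0 * (s + 0.5 - 1.0 / pi))
    return min(dense, sparse)

def rate_bound(T, delta, K, alpha):
    """Order of the bound max{T^(-alpha), delta^(K+1)}, constants and logs dropped."""
    return max(T ** (-alpha), delta ** (K + 1))
```

For $s = p = \pi = 2$ the exponent is $0.4$; with $T = 10000$ and $\Delta = 0.1$ the bound is dominated by $\Delta$ when $K = 0$ and by $T^{-0.4}$ as soon as $K \ge 1$.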

The proof of Theorem 1 is postponed to Section 5. From a practical point of
view, when one computes the estimator $\hat f^{K}_{T,\Delta}$ from (1) the
sample size is $N_T$, which is why in Theorem 1 we give the resolution level
$J$ and the threshold $\eta$ as functions of $N_T$ instead of replacing $N_T$
by its deterministic counterpart. An explicit bound for $\kappa$ is given in
Lemma 4 hereafter.

In practice the values $T$ and $\Delta_T$ are imposed or chosen by the
practitioner. Theorem 1 ensures that the estimator corrected at order $K$
attains the minimax rate $T^{-\alpha(s,p,\pi)}$ for the smallest $K$ such that

\[ \Delta_T = O\big( T^{-\alpha(s,p,\pi)/(K+1)} \big). \]

Since $\alpha(s,p,\pi) \le 1/2$ it is sufficient to choose $K$ such that

\[ T \Delta_T^{2K+2} = O(1). \]

If $\Delta_T$ decays as a power of $T$, i.e. if there exists $\delta > 0$ such
that for some $\mathfrak{C} > 0$

\[ \Delta_T \le \mathfrak{C} \, T^{-\delta}, \]

it is always possible to find a correction level $K$ satisfying the previous
constraint. The case $K = 0$ corresponds to the uncorrected estimator; it is
the naive estimator one would compute making the approximation
$f \approx \mathbf{P}_\Delta[f]$. In that case we get a rate of convergence in

\[ \max\{ T^{-\alpha(s,p,\pi)}, \Delta_T \}, \]

which attains the minimax rate if $T^{\alpha(s,p,\pi)} \Delta_T \le 1$. Since
$\alpha(s,\pi,p) \le 1/2$, it follows that the condition
$T^{\alpha(s,p,\pi)} \Delta_T \le 1$ already improves on the condition
$T\Delta_T \le 1$ of Bec and Lacour [1], Comte and Genon-Catalot [4, 6] or
Figueroa-Lopez [10] (see Section 4 for comparison with other works).
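The choice of the correction level can be automated. The sketch below searches for the smallest $K$ with $T\Delta_T^{2K+2} \le 1$, taking the constant in the $O(1)$ condition equal to 1, which is an assumption made for illustration; the name `smallest_correction_order`, the search cap `k_max` and the rounding tolerance `tol` are likewise hypothetical.

```python
def smallest_correction_order(T, delta, k_max=50, tol=1e-9):
    """Smallest K <= k_max with T * delta^(2K+2) <= 1, or None if there is none.

    The small tolerance absorbs floating-point rounding in borderline cases
    such as T = 10000 and delta = 0.1, where the product equals 1 in exact
    arithmetic.
    """
    for K in range(k_max + 1):
        if T * delta ** (2 * K + 2) <= 1.0 + tol:
            return K
    return None
```

With $T = 10000$ and $\Delta = 0.1$ the function returns 1, since $T\Delta^2 = 100$ while $T\Delta^4 = 1$.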

3 A numerical example 

We illustrate the behaviour of the estimator corrected at order $K$ when $K$
increases and compare its performance with an oracle: the wavelet estimator
we would compute in the idealised framework where all the jumps are observed

\[ \hat f^{Oracle}(x) = \sum_{k} \hat\alpha^{Oracle}_{0k} \phi_{0k}(x) + \sum_{j=0}^{J} \sum_{k} \hat\beta^{Oracle}_{jk} \mathbf{1}_{\{|\hat\beta^{Oracle}_{jk}| \ge \eta\}} \psi_{jk}(x), \]

where

\[ \hat\alpha^{Oracle}_{0k} = \frac{1}{R_T} \sum_{i=1}^{R_T} \phi_{0k}(\xi_i) \quad \text{and} \quad \hat\beta^{Oracle}_{jk} = \frac{1}{R_T} \sum_{i=1}^{R_T} \psi_{jk}(\xi_i), \]

$R_T$ being the value of the Poisson process $R$ at time $T$ and $(\xi_i)$ the
jumps. The parameters $J$ and $\eta$ as well as the wavelet bases
$(\phi, \psi)$ are the same as those used to compute the estimator corrected
at order $K$. We consider a compound Poisson process of intensity
$\vartheta = 1$ on $[0,T]$ and of compound law

\[ f(x) = (1-a) f_1(x) + a f_2(x), \]

where $f_1$ is the density of a Gaussian $\mathcal{N}(0,1)$ and $f_2$ that of
a Laplace distribution with location parameter 1 and scale parameter 0.1; we
take $a = 0.05$.
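The test density and a sampler for it can be written in a few lines. The sketch below is illustrative (the names `mixture_density` and `draw_jump` are hypothetical); it draws the Laplace component as a scaled difference of two independent standard exponential variables.

```python
import math
import random

def mixture_density(x, a=0.05):
    """f(x) = (1 - a) * N(0, 1) density + a * Laplace(location 1, scale 0.1) density."""
    gauss = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    laplace = math.exp(-abs(x - 1.0) / 0.1) / (2.0 * 0.1)
    return (1.0 - a) * gauss + a * laplace

def draw_jump(rng, a=0.05):
    """Draw one jump from the mixture."""
    if rng.random() < a:
        # The difference of two independent Exp(1) variables is standard Laplace.
        return 1.0 + 0.1 * (rng.expovariate(1.0) - rng.expovariate(1.0))
    return rng.gauss(0.0, 1.0)
```

The Laplace component has little mass but a sharp peak at 1, where the density is close to 0.48, about twice the value of the Gaussian component there.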




Figure 1: Density $f$: $f(x) = 0.95 f_1(x) + 0.05 f_2(x)$, $x \in [-6, 6]$.

We estimate the mixture $f$ (see Figure 1) on $\mathcal{D} = [-6, 6]$ with the
estimator corrected at order $K$ for different values of $K$ and study the
results with the $L_2$ error. We also compare them with the oracle
$\hat f^{Oracle}$. Wavelet estimators are based on the evaluation of the
first wavelet coefficients; to compute those we use Symlets 4 wavelet
functions and a resolution level $J = 10$. Moreover, we transform the data
into an equispaced signal on a grid of length $2^L$ with $L = 8$; this is the
binning procedure (see Härdle et al. [11] Chap. 12). The threshold is chosen
as in Theorem 1. The estimators we obtain take the form of a vector giving
the estimated values of the density $f$ on the uniform grid $[-6, 6]$ with
mesh 0.01. We use the wavelet toolbox of Matlab.

Figure 2 represents the corrected estimator for $K = 0$ and $K = 1$ and the
oracle. All the estimators are evaluated on the same trajectory. They manage
to reproduce the shape of the density $f$. As expected, the oracle looks
better than the other two, and the uncorrected estimator ($K = 0$) seems to
make larger errors than the 1-corrected one in estimating $f$. Figure 3
represents, for every value in $[-6, 6]$, the absolute distance between those
estimators, evaluated on the same trajectory, and the true density $f$. It
therefore shows in which area an estimator fails to estimate $f$ and gives an
idea of the error made. The graph was obtained after $M = 1000$ Monte Carlo
simulations of each estimator and averaging the results. The uncorrected
estimator is not as good as the estimator corrected at order 1. The oracle
and the estimator corrected at order 1 seem to have similar performances.
Each of the estimators makes larger errors around 1, which is where the
density $f$ is peaked.




Figure 2: Estimators of the density $f$ (plain grey) for $T = 10000$ and
$\Delta = 0.1$: the uncorrected (dotted red), the 1-corrected (dashed green)
and the oracle (plain dark).

Figure 3: Mean absolute error between the estimators and the true density
($M = 1000$, $T = 10000$ and $\Delta = 0.1$): the uncorrected (dotted red),
the 1-corrected (dashed green) and the oracle (plain dark).



Evaluation of the $L_2$ errors confirms the former graphical observation. We
approximate the $L_2$ errors by Monte Carlo. For that we compute $M = 1000$
times each estimator (for $T = 10000$ and $\Delta = 0.1$) and approximate the
$L_2$ loss by

\[ \frac{1}{M} \sum_{i=1}^{M} \Big( \sum_{p=0}^{1200} \big( \hat f_i(-6 + 0.01p) - f(-6 + 0.01p) \big)^2 \times 0.01 \Big), \]

where $\hat f_i$ is the $i$th realisation of one of the estimators. For each
Monte Carlo iteration the corrected and oracle estimators are evaluated on
the same trajectory. The results are reproduced in the following table.



Estimator                               Oracle   K = 0    K = 1    K = 2    K = 3
L2 error ($\times 10^{-4}$)             0.1117   0.1842   0.1353   0.1350   0.1350
Standard deviation ($\times 10^{-5}$)   0.3495   0.4434   0.4363   0.4366   0.4366
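The Monte Carlo approximation of the $L_2$ loss used for this table is a Riemann sum on the evaluation grid averaged over the replications. A sketch, assuming each estimator is stored as a list of values on the grid (the names `squared_l2_error` and `monte_carlo_l2` are hypothetical), could read:

```python
def squared_l2_error(estimate, truth, mesh=0.01):
    """Riemann-sum approximation of the integrated squared error on a uniform grid."""
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) * mesh

def monte_carlo_l2(estimates, truth, mesh=0.01):
    """Average of the integrated squared errors over the Monte Carlo replications."""
    errors = [squared_l2_error(estimate, truth, mesh) for estimate in estimates]
    return sum(errors) / len(errors)
```

Here `estimates` would hold the $M$ vectors of estimated density values and `truth` the values of $f$ on the same grid.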



This confirms that there is an actual gain in considering the estimator
corrected at order 1 instead of the uncorrected one. In the following table
we estimate the $(p_m(\Delta))$ defined in Proposition 1.

Estimated quantity    $p_1$    $p_2$    $p_3$
Estimation            0.9508   0.0476   0.0016
Standard deviation    0.0022   0.0022   0.0004

It turns out that without the correction we estimate the density $f$ on a data
set where 5% of the observations are realisations of a law which is not $f$.
This explains why it is relevant to take them into account when estimating
$f$. Considering more than 1 or 2 corrections is unnecessary as the $L_2$
losses stabilise afterwards. The $L_2$ loss of the oracle is strictly lower
than the loss of the estimator corrected at order $K$, even for large $K$.
That difference is explained by the fact that to estimate the $m$th
convolution power we do not use $N_T$ data points but
$N_{T,m} = \lfloor N_T/m \rfloor$. Therefore we do not lose in terms of rate
of convergence, but we surely deteriorate the constants in comparison with
the oracle. Numerical results are consistent with the theoretical results of
Theorem 1, where we proved a rate of convergence for the estimator corrected
at order $K$ in

\[ \max\{ T^{-\alpha(s,p,\pi)}, \Delta_T^{K+1} \}. \]

Since $\alpha(s,p,\pi) \le 1/2$, the rate decreases with $K$ and becomes
stable once $\Delta_T^{2K+2} T \le \mathfrak{C}$. In the numerical example we
took $T = 10000$ and $\Delta = 0.1$, thus $T\Delta^4 = 1$, which explains why
in the example we did not observe improvements when correcting with $K$
greater than 2.



4 Discussion 

4.1 Relation to other works 

A compound Poisson process is a pure jump Lévy process and can be studied
accordingly using the Lévy-Khintchine formula. Estimating the jump density
$f$ is then equivalent to estimating the Lévy measure, since for a compound
Poisson process it is the product $\vartheta f(x)dx$. A possible estimation
strategy in that case is to provide an estimator of the Fourier transform of
the density. That strategy is quite different from the one introduced in this
paper but is usually adopted when estimating the compound law of a compound
Poisson process (see Figueroa-Lopez [10], Comte and Genon-Catalot [4, 6] or
Bec and Lacour [1]).

The nonparametric estimation of the Lévy measure from the discrete
observation of a pure jump Lévy process from high frequency data (which
corresponds to our microscopic regime $\Delta_T \to 0$) has been studied in
great detail by Comte and Genon-Catalot [4, 6] and Figueroa-Lopez [10]. In
[10] the nonparametric estimation of the Lévy density is made via a sieve
estimator. It is shown to attain minimax rates of convergence for the $L_2$
loss uniformly over a class of Besov functions for a sampling rate $\Delta_T$
such that, with our notation, $T\Delta_T \le 1$. Comte and Genon-Catalot
[4, 6] construct an adaptive nonparametric estimator of the Lévy measure,
which attains minimax rates of convergence on Sobolev spaces for the $L_2$
loss for a sampling rate $\Delta_T$ such that $T\Delta_T \le 1$ (or
$T\Delta_T^2 \le 1$ under smoother assumptions). Bec and Lacour [1] obtained
similar results when $T\Delta_T \le 1$. The statistical setting of [6] is more
general since they estimate the Lévy measure from observations of a Lévy
process with a Brownian component.

Our result is limited to the Poisson case, contrary to Bec and Lacour [1],
Comte and Genon-Catalot [4] and Figueroa-Lopez [10] who worked on the larger
class of pure jump Lévy processes. However, in the case of a Poisson process
we generalise them since we provide an adaptive density estimator which
attains minimax rates of convergence, for the $L_p$ loss, $p \ge 1$,
uniformly over Besov balls for regimes where $\Delta_T$ vanishes polynomially
slowly. If $\Delta_T$ decays even slower, for instance logarithmically in
$T$, we still have an upper bound for the rate of convergence of our
estimator.

4.2 Possible extensions 

In this paper we give an adaptive minimax procedure for the estimation of the
compound density of a compound Poisson process in the microscopic regime.
The same estimation problem in an intermediate regime, namely when the
process is observed at a fixed sampling rate $\Delta > 0$, has been studied in
van Es et al. [20] and in the more general setting of Lévy processes by Comte
and Genon-Catalot [5] and Reiß [18]. van Es et al. [20] provide a consistent
kernel density estimator of the compound density of a compound Poisson
process of known intensity. They also focus on the nonzero increments for the
estimation, but sidestep the problem of the random number of data $N_T$ by
assuming that they have a sample of a given size.

The estimator corrected at order $K$ presented here should extend to the
intermediate regime where $\Delta_T \to \Delta_\infty < 1$, and the rate of
convergence given in Theorem 1 should generalise in

\[ \max\{ T^{-\alpha(s,p,\pi)}, \Delta_T^{K+1} \}. \]

An improvement of the results would be the estimation of the compound density
of renewal reward processes, or Continuous Time Random Walks, where it is no
longer imposed that the elapsed time between jumps is exponentially
distributed. Then the Lévy property is lost: the increments of the renewal
process are no longer independent nor identically distributed, and an
estimation strategy based on the Lévy-Khintchine formula is not possible.
Such processes make it possible to model random phenomena where the elapsed
time between events is not memoryless; they have many applications, for
instance in finance (see Meerschaert et al. [16]), in biology (see Fedotov et
al. [9]) or for modelling earthquakes (see Helmstetter et al. [12]).



5 Proof of Theorem 1 

In the sequel $\mathfrak{C}$ denotes a generic constant which may vary from
line to line. Its dependencies may be indicated in the index.

5.1 Proof of part 1) of Theorem 1 

Preliminary lemmas 

To prove part 1) of Theorem 1 we apply the general results of Kerkyacharian 
and Picard [14]. For that we establish some technical lemmas. 

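Lemma 2. If $f$ belongs to $\mathcal{F}(s,\pi,\mathfrak{M})$ then for
$m \ge 1$, $\mathbf{P}_\Delta[f]^{*m}$ also belongs to
$\mathcal{F}(s,\pi,\mathfrak{M})$.

Proof of Lemma 2. It is straightforward to derive
$\|\mathbf{P}_\Delta[f]^{*m}\|_{L_1(\mathbb{R})} = 1$. The remainder of the
proof is a consequence of the following result: let
$f \in \mathcal{B}^s_{\pi\infty}$ and $g \in L_1$; we have

\[ \|f * g\|_{\mathcal{B}^s_{\pi\infty}} \le \|f\|_{\mathcal{B}^s_{\pi\infty}} \|g\|_{L_1(\mathbb{R})}. \tag{$\Diamond$} \]

To prove ($\Diamond$) we use the following norm, which is equivalent to the
Besov norm (see [11]),

\[ \|\nu\|_{\mathcal{B}^s_{\pi\infty}} = \|\nu\|_{L_\pi} + \|\nu^{(n)}\|_{L_\pi} + \Big\| \frac{w^2_\pi(\nu^{(n)}, t)}{t^a} \Big\|_{L_\infty}, \tag{17} \]

where $s = n + a$, $n \in \mathbb{N}$ and $a \in (0,1]$, and $w$ is the
modulus of continuity

\[ w^2_\pi(\nu, t) = \sup_{|h| \le t} \| \mathbf{D}^h \mathbf{D}^h [\nu] \|_{L_\pi(\mathbb{R})}, \]

where $\mathbf{D}^h[\nu](x) = \nu(x - h) - \nu(x)$. The result is a
consequence of Young's inequality and elementary properties of the
convolution product. We use the definition (17) of the norm and treat each
term separately. First, Young's inequality gives

\[ \|f_1 * f_2\|_{L_\pi(\mathbb{R})} \le \|f_1\|_{L_\pi(\mathbb{R})} \|f_2\|_{L_1(\mathbb{R})}. \tag{18} \]

Then the differentiation property of the convolution product leads for
$n \ge 1$ to

\[ \Big\| \frac{d^n}{dx^n} (f_1 * f_2) \Big\|_{L_\pi(\mathbb{R})} = \Big\| \Big( \frac{d^n}{dx^n} f_1 \Big) * f_2 \Big\|_{L_\pi(\mathbb{R})} \le \Big\| \frac{d^n}{dx^n} f_1 \Big\|_{L_\pi(\mathbb{R})} \|f_2\|_{L_1(\mathbb{R})}. \tag{19} \]

Finally, translation invariance of the convolution product gives

\[ \| \mathbf{D}^h \mathbf{D}^h [(f_1 * f_2)^{(n)}] \|_{L_\pi(\mathbb{R})} = \| (\mathbf{D}^h \mathbf{D}^h [f_1^{(n)}]) * f_2 \|_{L_\pi(\mathbb{R})} \le \| \mathbf{D}^h \mathbf{D}^h [f_1^{(n)}] \|_{L_\pi(\mathbb{R})} \|f_2\|_{L_1(\mathbb{R})}. \tag{20} \]

Inequality ($\Diamond$) is then obtained by bounding (17) with (18), (19) and
(20). To complete the proof of Lemma 2, we apply $m - 1$ times ($\Diamond$),
which leads to

\[ \forall m \in \mathbb{N} \setminus \{0\}, \quad \|\mathbf{P}_\Delta[f]^{*m}\|_{\mathcal{B}^s_{\pi\infty}} \le \|\mathbf{P}_\Delta[f]\|_{\mathcal{B}^s_{\pi\infty}}. \]

The triangle inequality gives
$\|\mathbf{P}_\Delta[f]\|_{\mathcal{B}^s_{\pi\infty}} \le \|f\|_{\mathcal{B}^s_{\pi\infty}} \le \mathfrak{M}$,
which concludes the proof.

□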



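Lemma 3. Let $2^j \le N_T$. Then for all $m \in \mathbb{N} \setminus \{0\}$
and for $p \ge 1$ we have

\[ \mathbb{E}\big[ |\hat\gamma^{(m)}_{jk} - \gamma^{(m)}_{jk}|^p \,\big|\, N_T \big] \le \mathfrak{C}_{p,m,\|g\|_{L_p(\mathbb{R})},\mathfrak{M},\vartheta} \, N_T^{-p/2}, \]

where $\hat\gamma^{(m)}_{jk}$ is defined in (12) and

\[ \gamma^{(m)}_{jk} = \int g_{jk}(y) \, \mathbf{P}_\Delta[f]^{*m}(y) \, dy. \tag{21} \]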



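Proof of Lemma 3. The proof is obtained with Rosenthal's inequality: let
$p \ge 1$ and let $(Y_1, \ldots, Y_n)$ be independent random variables such
that $\mathbb{E}[Y_i] = 0$ and $\mathbb{E}[|Y_i|^p] < \infty$. Then there
exists $\mathfrak{C}_p$ such that

\[ \mathbb{E}\Big[ \Big| \sum_{i=1}^{n} Y_i \Big|^p \Big] \le \mathfrak{C}_p \Big\{ \sum_{i=1}^{n} \mathbb{E}[|Y_i|^p] + \Big( \sum_{i=1}^{n} \mathbb{E}[|Y_i|^2] \Big)^{p/2} \Big\}. \tag{22} \]

The $(\mathbf{D}^{\Delta_T}_m X_{S_i})$ are independent and identically
distributed with common density $\mathbf{P}_{\Delta_T}[f]^{*m}$ and
$\mathbb{E}[\hat\gamma^{(m)}_{jk}] = \gamma^{(m)}_{jk}$. Then
$\hat\gamma^{(m)}_{jk} - \gamma^{(m)}_{jk}$ is a sum of
$N_{T,m} = \lfloor N_T/m \rfloor$ centered, independent and identically
distributed random variables. It follows that

\[ \mathbb{E}\big[ |g_{jk}(\mathbf{D}^{\Delta_T}_m X_{S_i})|^p \big] \le 2^{jp/2} \int |g(2^j y - k)|^p \, \mathbf{P}_{\Delta_T}[f]^{*m}(y) \, dy \]
\[ = 2^{j(p/2-1)} \int |g(z)|^p \, \mathbf{P}_{\Delta_T}[f]^{*m}\Big( \frac{z + k}{2^j} \Big) dz \le 2^{j(p/2-1)} \|g\|^p_{L_p(\mathbb{R})} \|\mathbf{P}_{\Delta_T}[f]^{*m}\|_\infty, \]

where we made the substitution $z = 2^j y - k$. To control
$\|\mathbf{P}_{\Delta_T}[f]^{*m}\|_\infty$ we use the Sobolev embeddings (see
[3, 7, 11])

\[ \mathcal{B}^s_{\pi\infty} \hookrightarrow \mathcal{B}^{s'}_{p\infty} \quad \text{and} \quad \mathcal{B}^s_{\pi\infty} \hookrightarrow \mathcal{B}^{s-1/\pi}_{\infty\infty}, \tag{23} \]

where $p > \pi$, $s\pi > 1$ and $s' = s - 1/\pi + 1/p$. It follows that

\[ \|\mathbf{P}_{\Delta_T}[f]^{*m}\|_\infty \le \mathfrak{C}_{s,\pi} \|\mathbf{P}_{\Delta_T}[f]^{*m}\|_{\mathcal{B}^s_{\pi\infty}(\mathcal{D})}. \]

We deduce from Lemma 2 that
$\|\mathbf{P}_{\Delta_T}[f]^{*m}\|_\infty \le \mathfrak{C}_{s,\pi} \mathfrak{M}$.
We get

\[ \mathbb{E}\big[ |g_{jk}(\mathbf{D}^{\Delta_T}_m X_{S_i})|^p \big] \le \mathfrak{C}_{s,\pi} \, 2^{j(p/2-1)} \|g\|^p_{L_p(\mathbb{R})} \mathfrak{M} \]

and $\mathbb{E}[|g_{jk}(\mathbf{D}^{\Delta_T}_m X_{S_i})|^2] \le \mathfrak{M}$
since $\|g\|_{L_2} = 1$.

The accept-reject algorithm ensures that for all $n \ge 1$ the increments
$(\mathbf{D}^{\Delta_T} X_{S_1}, \ldots, \mathbf{D}^{\Delta_T} X_{S_n})$ are
independent of $N_T$ and then of $N_{T,m}$. Indeed the
$(\mathbf{D}^{\Delta_T} X_i, i = 1, \ldots, \lfloor T\Delta_T^{-1} \rfloor)$
are independent and identically distributed and the
$(\mathbf{D}^{\Delta_T} X_{S_i})$ are constructed with
$S_i = \inf\{ j > S_{i-1}, \mathbf{D}^{\Delta_T} X_j \ne 0 \}$. Therefore we
can apply Rosenthal's inequality conditional on $N_T$ to
$\hat\gamma^{(m)}_{jk} - \gamma^{(m)}_{jk}$ and derive for $p \ge 1$

\[ \mathbb{E}\big[ |\hat\gamma^{(m)}_{jk} - \gamma^{(m)}_{jk}|^p \,\big|\, N_T \big] \le \mathfrak{C}_p \Big\{ 2^p \Big( \frac{2^j}{N_{T,m}} \Big)^{p/2-1} \|g\|^p_{L_p(\mathbb{R})} \mathfrak{M} + \mathfrak{M}^{p/2} \Big\} N_{T,m}^{-p/2}. \]

Since $2^j \le N_T$ and $N_{T,m} \ge N_T/(2m)$ for $T$ large enough, this
concludes the proof. □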

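Lemma 4. Choose $j$ and $c$ such that

$$2^j N_T^{-1} \log\big(N_T^{1/2}\big) \le 1 \quad \text{and} \quad c^2 \ge \frac{16m}{3}\Big(\mathfrak{M} + \frac{c\|g\|_\infty}{6}\Big).$$

For all $m \in \mathbb{N}\setminus\{0\}$ and $r \ge 1$ let $\kappa_r = cr$. We have

$$\mathbb{P}\Big(|\hat\gamma_{jk}^{(m)} - \gamma_{jk}^{(m)}| \ge \frac{\kappa_r}{2}\, N_T^{-1/2} \sqrt{\log\big(N_T^{1/2}\big)} \,\Big|\, N_T\Big) \le N_T^{-r/2},$$

where $\hat\gamma_{jk}^{(m)}$ is defined in (12) and $\gamma_{jk}^{(m)}$ in (21).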

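Proof of Lemma 4. The proof is obtained with Bernstein's inequality. Consider $Y_1, \ldots, Y_n$ independent random variables such that $|Y_i| \le \mathfrak{A}$, $\mathbb{E}[Y_i] = 0$ and $b_n^2 = \frac{1}{n}\sum_{i=1}^n \mathbb{E}[Y_i^2]$. Then for any $\lambda > 0$,

$$\mathbb{P}\Big(\Big|\frac{1}{n}\sum_{i=1}^n Y_i\Big| > \lambda\Big) \le 2\exp\Big(-\frac{n\lambda^2}{2\big(b_n^2 + \frac{\lambda\mathfrak{A}}{3}\big)}\Big). \qquad (24)$$

For all $m \ge 1$, $\hat\gamma_{jk}^{(m)} - \gamma_{jk}^{(m)}$ is an average of $N_{T,m} = \lfloor N_T/m \rfloor$ centered, independent and identically distributed random variables bounded by $2^{j/2}\|g\|_\infty$ and $\mathbb{E}[|g_{jk}(\mathbf{D}^{\Delta_T}_m X_{S_i})|^2] \le \mathfrak{M}$. The accept-reject algorithm ensures that for all $n \ge 1$ the increments $(\mathbf{D}^{\Delta_T} X_{S_1}, \ldots, \mathbf{D}^{\Delta_T} X_{S_n})$ are independent of $N_T$ (see the proof of Lemma 3), so we apply Bernstein's inequality conditional on $N_T$. We have

$$\mathbb{P}\Big(|\hat\gamma_{jk}^{(m)} - \gamma_{jk}^{(m)}| \ge \frac{\kappa_r}{2}\, N_T^{-1/2}\sqrt{\log\big(N_T^{1/2}\big)} \,\Big|\, N_T\Big)$$
$$\le 2\exp\Big(-\frac{\kappa_r^2\, N_{T,m} N_T^{-1} \log\big(N_T^{1/2}\big)}{8\big(\mathfrak{M} + \frac{1}{6}\kappa_r N_T^{-1/2}\sqrt{\log(N_T^{1/2})}\, 2^{j/2}\|g\|_\infty\big)}\Big)$$
$$= 2\exp\Big(-\frac{\kappa_r^2\, N_{T,m} N_T^{-1}}{8\big(\mathfrak{M} + \frac{1}{6}\kappa_r N_T^{-1/2}\sqrt{\log(N_T^{1/2})}\, 2^{j/2}\|g\|_\infty\big)}\, \log\big(N_T^{1/2}\big)\Big).$$

Using that

$$m N_T^{-1} N_{T,m} = \frac{m}{N_T}\Big\lfloor \frac{N_T}{m} \Big\rfloor \ge \frac{1}{2}$$

for $T$ large enough and $2^{j/2} N_T^{-1/2}\sqrt{\log(N_T^{1/2})} \le 1$, it follows that

$$\mathbb{P}\Big(|\hat\gamma_{jk}^{(m)} - \gamma_{jk}^{(m)}| \ge \frac{\kappa_r}{2}\, N_T^{-1/2}\sqrt{\log\big(N_T^{1/2}\big)} \,\Big|\, N_T\Big) \le 2\exp\Big(-\frac{3c^2 r}{16m\big(\mathfrak{M} + \frac{\kappa_r\|g\|_\infty}{6}\big)}\, r\log\big(N_T^{1/2}\big)\Big).$$

With $c^2 \ge \frac{16m}{3}\big(\mathfrak{M} + \frac{c\|g\|_\infty}{6}\big)$ we get

$$\mathbb{P}\Big(|\hat\gamma_{jk}^{(m)} - \gamma_{jk}^{(m)}| \ge \frac{\kappa_r}{2}\, N_T^{-1/2}\sqrt{\log\big(N_T^{1/2}\big)} \,\Big|\, N_T\Big) \le N_T^{-r/2}.$$

The proof is complete. □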
Completion of proof of part 1) of Theorem 1

Part 1) of Theorem 1 is a consequence of Lemmas 2, 3 and 4 and of the general theory of wavelet threshold estimators of [14]. It suffices to check conditions (5.1) and (5.2) of Theorem 5.1 of [14], which are satisfied (by Lemmas 3 and 4) with $c(T) = N_T$ and $\Lambda_n = c(T)^{-1}$ (with the notations of [14]). We can now apply Theorem 5.1, its Corollary 5.1 and Theorem 6.1 of [14] to obtain the result.



5.2 Proof of part 2) of Theorem 1 

Preliminary result 

The result of part 1) of Theorem 1 is given conditional on $N_T$. To prove part 2) we replace $N_T$ by its deterministic counterpart. We introduce the following result.

Proposition 2. For all $r > 0$, there exists $1 \le \mathfrak{C}_\vartheta < \infty$, where $\vartheta \mapsto \mathfrak{C}_\vartheta$ is continuous, such that

$$\frac{1}{\mathfrak{C}_\vartheta}\, T^{-r} \le \mathbb{E}[N_T^{-r}] \le \mathfrak{C}_\vartheta\, T^{-r}.$$

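Proof of Proposition 2. We have

$$N_T = \sum_{i=1}^{\lfloor T\Delta_T^{-1} \rfloor} \mathbb{1}_{\{\mathbf{D}^{\Delta_T} X_i \ne 0\}},$$

where

$$\mathbb{E}\big[\mathbb{1}_{\{\mathbf{D}^{\Delta_T} X_i \ne 0\}}\big] = p(\Delta_T) = 1 - \exp(-\vartheta\Delta_T).$$

Introduce $Y_i = \mathbb{1}_{\{\mathbf{D}^{\Delta_T} X_i \ne 0\}} - p(\Delta_T)$. The $Y_i$ are centered, independent and identically distributed random variables bounded by 2 and $\mathbb{E}[Y_i^2] \le p(\Delta_T)$. It follows from Bernstein's inequality (24) that for $\lambda > 0$

$$\mathbb{P}\Big(\Big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\Big| > \lambda\Big) \le 2\exp\Big(-\frac{\lfloor T\Delta_T^{-1} \rfloor \lambda^2}{2\big(p(\Delta_T) + \frac{2\lambda}{3}\big)}\Big).$$

We choose $\lambda = p(\Delta_T)/2$. On the set $\big\{\big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\big| \le \lambda\big\}$ we have

$$\frac{p(\Delta_T)}{2} \le \frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} \le \frac{3p(\Delta_T)}{2}. \qquad (25)$$

Moreover for $\Delta_T$ small enough we have that

$$\frac{\vartheta\Delta_T}{2} \le p(\Delta_T) = 1 - \exp(-\vartheta\Delta_T) \le \vartheta\Delta_T.$$

We have for all $\lambda > 0$

$$\mathbb{E}[N_T^{-r}] = \mathbb{E}\Big[N_T^{-r}\, \mathbb{1}_{\big\{\big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\big| \le \lambda\big\}}\Big] + \mathbb{E}\Big[N_T^{-r}\, \mathbb{1}_{\big\{\big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\big| > \lambda\big\}}\Big].$$

Since for $r > 0$ the function $x \mapsto x^{-r}$ is decreasing and $N_T \ge 1$, we have using (25) the upper bound

$$\mathbb{E}[N_T^{-r}] \le \Big(\frac{\lfloor T\Delta_T^{-1} \rfloor\, p(\Delta_T)}{2}\Big)^{-r} + \mathbb{P}\Big(\Big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\Big| > \frac{p(\Delta_T)}{2}\Big)$$
$$\le \Big(\frac{\lfloor T\Delta_T^{-1} \rfloor\, p(\Delta_T)}{2}\Big)^{-r} + 2\exp\Big(-\frac{\lfloor T\Delta_T^{-1} \rfloor\, p(\Delta_T)^2}{8\big(p(\Delta_T) + \frac{p(\Delta_T)}{3}\big)}\Big)$$
$$\le \Big(\frac{\vartheta \lfloor T\Delta_T^{-1} \rfloor \Delta_T}{4}\Big)^{-r} + 2\exp\Big(-\frac{3\vartheta}{64}\, \lfloor T\Delta_T^{-1} \rfloor \Delta_T\Big).$$

For the lower bound we have

$$\mathbb{E}[N_T^{-r}] \ge \Big(\frac{3\lfloor T\Delta_T^{-1} \rfloor\, p(\Delta_T)}{2}\Big)^{-r}\, \mathbb{P}\Big(\Big|\frac{N_T}{\lfloor T\Delta_T^{-1} \rfloor} - p(\Delta_T)\Big| \le \frac{p(\Delta_T)}{2}\Big) \ge \Big(\frac{3T\vartheta}{2}\Big)^{-r} \Big(1 - 2\exp\Big(-\frac{3\vartheta}{64}\, \lfloor T\Delta_T^{-1} \rfloor \Delta_T\Big)\Big).$$

Then there exists $1 \le \mathfrak{C}_\vartheta < \infty$ with $\vartheta \mapsto \mathfrak{C}_\vartheta$ continuous such that

$$\frac{1}{\mathfrak{C}_\vartheta}\, T^{-r} \le \mathbb{E}[N_T^{-r}] \le \mathfrak{C}_\vartheta\, T^{-r}.$$

The proof is now complete. □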
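The scaling in Proposition 2 can be checked numerically. The sketch below is only an illustration under stated assumptions: it simulates $N_T$ as a sum of independent Bernoulli variables with success probability $p(\Delta) = 1 - e^{-\vartheta\Delta}$, replaces $N_T$ by $\max(N_T, 1)$ to avoid division by zero, and uses hypothetical values for $\vartheta$, $r$, the sampling rate and the number of replications. If the proposition holds, the ratio $T^r\,\mathbb{E}[N_T^{-r}]$ should stay within a fixed band as $T$ grows.

```python
import math
import random

def mean_inverse_moment(T, per_unit, theta, r, reps, rng):
    """Monte Carlo estimate of E[max(N_T, 1)^(-r)] for Delta = 1 / per_unit."""
    delta = 1.0 / per_unit
    p = 1.0 - math.exp(-theta * delta)    # p(Delta): probability of a nonzero increment
    n = T * per_unit                      # number of increments, floor(T / Delta)
    total = 0.0
    for _rep in range(reps):
        n_T = sum(1 for _i in range(n) if rng.random() < p)
        total += max(n_T, 1) ** (-r)      # guard against N_T = 0 (an assumption of this sketch)
    return total / reps

rng = random.Random(0)                    # fixed seed, for reproducibility
theta, r = 1.0, 1.0                       # illustrative values, not taken from the paper
ratios = [T ** r * mean_inverse_moment(T, 10, theta, r, 400, rng)
          for T in (25, 50, 100)]
print(ratios)                             # each ratio should remain of order one
```

The band in which the ratios fall plays the role of $[1/\mathfrak{C}_\vartheta, \mathfrak{C}_\vartheta]$; the simulation does not identify the constant itself.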



Completion of proof of part 2) of Theorem 1 

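To prove Theorem 1 we define, for $K \in \mathbb{N}$ and $x \in \mathcal{D}$, the quantity

$$\tilde f^K_{T,\Delta_T}(x) = \sum_{m=1}^{K+1} \frac{(-1)^{m+1}}{m}\, \frac{(e^{\vartheta\Delta_T} - 1)^m}{\vartheta\Delta_T}\, \hat P_{\Delta_T,m}(x).$$

It is the estimator of $f$ one would compute if $\vartheta$ were known. We decompose the $L_p$ error as follows

$$\big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} + \big(\mathbb{E}\big[\|\tilde f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p}$$

and control each term separately.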

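First we control $\mathbb{E}[\|\tilde f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}]$. Using the triangle inequality we get

$$\big(\mathbb{E}\big[\|\tilde f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \sum_{m=1}^{K+1} \frac{(e^{\vartheta\Delta_T} - 1)^m}{m\,\vartheta\Delta_T}\, \big(\mathbb{E}\big[\|\hat P_{\Delta_T,m} - \mathbf{P}_{\Delta_T}[f]^{\star m}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \qquad (26)$$
$$+ \sum_{m=K+2}^{\infty} \frac{(e^{\vartheta\Delta_T} - 1)^m}{m\,\vartheta\Delta_T}\, \|\mathbf{P}_{\Delta_T}[f]^{\star m}\|_{L_p(\mathcal{D})}. \qquad (27)$$

To bound (26) we use part 1) of Theorem 1 in which the supremum is taken over the class $\{\mathbf{P}_{\Delta_T}[f]^{\star m} \in \mathcal{F}(s,\pi,\mathfrak{M})\}$. With the inclusion

$$\{\mathbf{P}_{\Delta_T}[f]^{\star m},\ f \in \mathcal{F}(s,\pi,\mathfrak{M})\} \subset \mathcal{F}(s,\pi,\mathfrak{M})$$

and Proposition 2 applied with $r = \alpha(s,p,\pi)p > 0$, we deduce the upper bound for $m \ge 1$

$$\mathbb{E}\big[\|\hat P_{\Delta_T,m} - \mathbf{P}_{\Delta_T}[f]^{\star m}\|^p_{L_p(\mathcal{D})}\big] \le \mathfrak{C}\,\mathbb{E}\big[N_T^{-\alpha(s,p,\pi)p}\big] \le \mathfrak{C}\, T^{-\alpha(s,p,\pi)p}, \qquad (28)$$

where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi,K,\vartheta)$. To bound (27), Young's inequality and $\|\mathbf{P}_{\Delta_T}[f]\|_{L_1(\mathbb{R})} = 1$ give

$$\|\mathbf{P}_{\Delta_T}[f]^{\star m}\|_{L_p(\mathbb{R})} \le \|\mathbf{P}_{\Delta_T}[f]\|_{L_p(\mathbb{R})} \quad \text{for } m \ge 1.$$

The triangle inequality leads to $\|\mathbf{P}_{\Delta_T}[f]\|_{L_p(\mathbb{R})} \le \|f\|_{L_p(\mathbb{R})}$ and we use the Sobolev embeddings (23) to get $\|f\|_{L_p(\mathbb{R})} \le C_{s,\pi,p}\mathfrak{M}$. We derive the upper bound

$$\sum_{m=K+2}^{\infty} \frac{(e^{\vartheta\Delta_T} - 1)^m}{m\,\vartheta\Delta_T}\, \|\mathbf{P}_{\Delta_T}[f]^{\star m}\|_{L_p(\mathcal{D})} \le \|f\|_{L_p(\mathbb{R})} \sum_{m=K+2}^{\infty} \frac{1}{m}\, \frac{(e^{\vartheta\Delta_T} - 1)^m}{\vartheta\Delta_T} \le \mathfrak{C}_{\vartheta,\mathfrak{M}}\, \Delta_T^{K+1}. \qquad (29)$$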

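Thus from (28) and (29) we obtain

$$\sup_{f \in \mathcal{F}(s,\pi,\mathfrak{M})} \big(\mathbb{E}\big[\|\tilde f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \mathfrak{C} \max\big\{T^{-\alpha(s,p,\pi)},\ \Delta_T^{K+1}\big\},$$

where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi,K,\vartheta)$. Since $\vartheta \mapsto \mathfrak{C}$ is continuous we get for $p \ge 1$

$$\sup_{\vartheta \in [\underline{\vartheta},\overline{\vartheta}]}\ \sup_{f \in \mathcal{F}(s,\pi,\mathfrak{M})} \big(\mathbb{E}\big[\|\tilde f^K_{T,\Delta_T} - f\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \mathfrak{C} \max\big\{T^{-\alpha(s,p,\pi)},\ \Delta_T^{K+1}\big\},$$

where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi,K)$.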
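The $\Delta_T^{K+1}$ term comes from the tail of the series in (27). A small numerical sketch, with illustrative values of $\vartheta$ and $K$ that are assumptions of the example rather than choices made in the paper, shows that the tail sum $\sum_{m \ge K+2} (e^{\vartheta\Delta}-1)^m/(m\vartheta\Delta)$ divided by $\Delta^{K+1}$ stays bounded as $\Delta$ decreases. Since $e^{\vartheta\Delta}-1 \approx \vartheta\Delta$ for small $\Delta$, the leading term of the ratio is expected to be close to $\vartheta^{K+1}/(K+2)$.

```python
import math

def tail(theta, delta, K, terms=200):
    """Truncated sum over m >= K+2 of (e^{theta*delta} - 1)^m / (m * theta * delta)."""
    x = theta * delta
    a = math.expm1(x)                     # e^{theta*delta} - 1; the series needs a < 1
    return sum(a ** m / (m * x) for m in range(K + 2, K + 2 + terms))

theta, K = 1.0, 2                         # illustrative values only
deltas = (0.1, 0.05, 0.025)
ratios = [tail(theta, d, K) / d ** (K + 1) for d in deltas]
print(ratios)                             # bounded, drifting toward theta**(K+1) / (K+2)
```

Halving $\Delta$ therefore divides the truncation error by roughly $2^{K+1}$, which is why a larger $K$ tolerates a slower sampling rate.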



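We now control $\mathbb{E}[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}]$ and use (15) to derive

$$\hat f^K_{T,\Delta_T}(x) = \sum_{m=1}^{K+1} \frac{(-1)^{m}}{m}\, \frac{\big((1 - \hat p_T)^{-1} - 1\big)^m}{\log(1 - \hat p_T)}\, \hat P_{\Delta_T,m}(x),$$

where $\hat P_{\Delta_T,m}$ does not depend on $\vartheta$ (see (12)). Define

$$G_m(x) = \frac{\big((1 - x)^{-1} - 1\big)^m}{\log(1 - x)}.$$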

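The triangle inequality leads to

$$\big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \sum_{m=1}^{K+1} \big(\mathbb{E}\big[\|(G_m(\hat p_T) - G_m(p(\Delta_T)))\, \hat P_{\Delta_T,m}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p},$$

where $p(\Delta_T)$ verifies $p(\Delta_T) = 1 - e^{-\vartheta\Delta_T} \le \mathfrak{C}_{\overline{\vartheta}}\, \Delta_T$ since

$$0 \le 1 - e^{-\underline{\vartheta}\Delta_T} \le 1 - e^{-\vartheta\Delta_T} \le 1 - e^{-\overline{\vartheta}\Delta_T} \le 1.$$

Moreover, we have

$$G_m'(x) = \frac{m x^{m-1}}{(1 - x)^{m+1} \log(1 - x)} + \frac{x^m}{(1 - x)^{m+1} (\log(1 - x))^2},$$

then for all $m \ge 1$, $x \mapsto x G_m'(x)$ is continuous over $(0, 1/2]$ and converges to 0 when $x \to 0$. We deduce

$$\big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \mathfrak{C}\, \Delta_T^{-1} \sum_{m=1}^{K+1} \big(\mathbb{E}\big[\|(\hat p_T - p(\Delta_T))\, \hat P_{\Delta_T,m}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p}.$$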
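The Cauchy-Schwarz inequality leads to

$$\mathbb{E}\big[\|(\hat p_T - p(\Delta_T))\, \hat P_{\Delta_T,m}\|^p_{L_p(\mathcal{D})}\big]^2 \le \mathbb{E}\big[|\hat p_T - p(\Delta_T)|^{2p}\big]\ \mathbb{E}\big[\|\hat P_{\Delta_T,m}\|^{2p}_{L_{2p}(\mathcal{D})}\big],$$

where using part 1) of Theorem 1 and that $N_T \ge 1$ we have

$$\mathbb{E}\big[\|\hat P_{\Delta_T,m}\|^{2p}_{L_{2p}(\mathcal{D})}\big] \le \mathbb{E}\Big[\big(\|\hat P_{\Delta_T,m} - \mathbf{P}_{\Delta_T}[f]^{\star m}\|_{L_{2p}(\mathcal{D})} + \|\mathbf{P}_{\Delta_T}[f]^{\star m}\|_{L_{2p}(\mathcal{D})}\big)^{2p}\Big]$$
$$\le \mathfrak{C}\,\mathbb{E}\big[N_T^{-2\alpha(s,p,\pi)p}\big] + \mathfrak{C}\,\mathfrak{M}^{2p}, \qquad (30)$$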



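where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi)$. We apply Rosenthal's inequality (22) to conclude the proof: $\hat p_T - p(\Delta_T)$ is the normalized sum of independent and identically distributed centered random variables

$$\big(Y_i = \mathbb{1}_{\{\mathbf{D}^{\Delta_T} X_i \ne 0\}} - p(\Delta_T),\ i \in \{1, \ldots, \lfloor T\Delta_T^{-1} \rfloor\}\big)$$

where $\mathbb{E}[|Y_i|^{2p}] \le 2^{2p}\,\mathbb{E}\big[\mathbb{1}^{2p}_{\{\mathbf{D}^{\Delta_T} X_i \ne 0\}}\big] \le \mathfrak{C}_{p,\overline{\vartheta}}\, \Delta_T$ and $\mathbb{E}[|Y_i|^2] \le \mathfrak{C}_{\overline{\vartheta}}\, \Delta_T$. Rosenthal's inequality (22) gives

$$\mathbb{E}\big[|\hat p_T - p(\Delta_T)|^{2p}\big] \le \mathfrak{C}\, \lfloor T\Delta_T^{-1} \rfloor^{-2p} \Big(\lfloor T\Delta_T^{-1} \rfloor \Delta_T + \big(\lfloor T\Delta_T^{-1} \rfloor \Delta_T\big)^p\Big). \qquad (31)$$

It follows from (30) and (31) that

$$\big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \mathfrak{C}\, \Delta_T^{-1} \lfloor T\Delta_T^{-1} \rfloor^{-1} \big(T^{1/(2p)} + T^{1/2}\big),$$

where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi,\underline{\vartheta},\overline{\vartheta},K)$. We deduce for $p \ge 1$

$$\sup_{\vartheta \in [\underline{\vartheta},\overline{\vartheta}]}\ \sup_{f \in \mathcal{F}(s,\pi,\mathfrak{M})} \big(\mathbb{E}\big[\|\hat f^K_{T,\Delta_T} - \tilde f^K_{T,\Delta_T}\|^p_{L_p(\mathcal{D})}\big]\big)^{1/p} \le \mathfrak{C}\big(T^{-(1 - 1/(2p))} + T^{-1/2}\big),$$

where $\mathfrak{C}$ depends on $(s,\pi,p,\mathfrak{M},\phi,\psi,\underline{\vartheta},\overline{\vartheta},K)$, and which is negligible compared to $T^{-\alpha(s,p,\pi)}$ since $\alpha(s,p,\pi) < 1/2$. The proof of Theorem 1 is now complete.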
6 Appendix

6.1 Proof of Proposition 1 

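Let $x \in \mathbb{R}$. By stationarity of the increments of the process $X$ we have

$$\mathbb{P}(\mathbf{D}^\Delta X_{S_1} \le x) = \mathbb{P}(X_\Delta \le x \mid X_\Delta \ne 0)$$
$$= \sum_{m=0}^{\infty} \mathbb{P}(X_\Delta \le x \mid R_\Delta = m,\ R_\Delta \ne 0)\ \mathbb{P}(R_\Delta = m \mid R_\Delta \ne 0)$$
$$= \sum_{m=1}^{\infty} p_m(\Delta)\ \mathbb{P}(X_\Delta \le x \mid R_\Delta = m),$$

where $\mathbb{P}(X_\Delta \le x \mid R_\Delta = m) = \int_{-\infty}^x f^{\star m}(y)\,dy$ for $m \ge 1$. It follows that

$$\mathbb{P}(\mathbf{D}^\Delta X_{S_1} \le x) = \int_{-\infty}^x \mathbf{P}_\Delta[f](y)\,dy.$$

Immediate computations give the expression of $p_m(\Delta)$. For the control of $p_1(\Delta)$, the assertion $p_1(\Delta) \le 1$ is immediate since $p_1(\Delta)$ is a probability. Moreover we have

$$\exp(\vartheta\Delta) - 1 = \vartheta\Delta\Big(1 + \vartheta\Delta \sum_{m=2}^{\infty} \frac{(\vartheta\Delta)^{m-2}}{m!}\Big),$$

where

$$g(\Delta) := \sum_{m=2}^{\infty} \frac{(\vartheta\Delta)^{m-2}}{m!} = \frac{1}{(\vartheta\Delta)^2}\big(\exp(\vartheta\Delta) - 1 - \vartheta\Delta\big) \to \frac{1}{2} \quad \text{as } \Delta \to 0.$$

Since $g$ is continuous, there exists $\Delta_0 > 0$ such that for all $\Delta \le \Delta_0$ we have $g(\Delta) \le 1$. It follows for $\Delta \le \Delta_0$ that

$$p_1(\Delta) = \frac{\vartheta\Delta}{\exp(\vartheta\Delta) - 1} = \frac{1}{1 + \vartheta\Delta\, g(\Delta)} \ge \frac{1}{1 + \vartheta\Delta}.$$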
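The weights can be examined numerically. The sketch below assumes that $p_m(\Delta)$ is the zero-truncated Poisson weight $(\vartheta\Delta)^m / (m!\,(e^{\vartheta\Delta} - 1))$, that is, $\mathbb{P}(R_\Delta = m \mid R_\Delta \ne 0)$; this explicit form is an assumption of the example, as are the values of `theta` and `delta`. It checks that the weights sum to one and that $p_1(\Delta) = 1/(1 + \vartheta\Delta\, g(\Delta))$ with $g(\Delta)$ close to $1/2$.

```python
import math

def p_m(m, theta, delta):
    """Assumed weight P(R_Delta = m | R_Delta != 0) for a Poisson count of intensity theta."""
    x = theta * delta
    return x ** m / (math.factorial(m) * math.expm1(x))

def g(theta, delta):
    """g(Delta) = (exp(theta*Delta) - 1 - theta*Delta) / (theta*Delta)^2."""
    x = theta * delta
    return (math.expm1(x) - x) / x ** 2

theta, delta = 2.0, 0.05                  # illustrative values only
x = theta * delta
total = sum(p_m(m, theta, delta) for m in range(1, 50))
p1 = p_m(1, theta, delta)
print(total, p1, g(theta, delta))
```

For small $\vartheta\Delta$ almost all the mass sits on $m = 1$, which is the microscopic regime studied in the paper: most nonzero increments contain a single jump.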



6.2 Proof of Lemma 1 

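Let $\mathcal{F}[f]$ denote the Fourier transform of $f$ and take $h$ such that $h = \mathbf{P}_\Delta[f]$. Using the one-to-one mapping between densities and their Fourier transforms, we show the relation for the Fourier transforms. The linearity of the Fourier transform and the relation $\mathcal{F}[f \star g] = \mathcal{F}[f]\,\mathcal{F}[g]$ give

$$\mathcal{F}[h] = \sum_{m=1}^{\infty} p_m(\Delta)\, \mathcal{F}[f]^m = \frac{e^{\vartheta\Delta \mathcal{F}[f]} - 1}{e^{\vartheta\Delta} - 1},$$

from which we deduce

$$\vartheta\Delta\, \mathcal{F}[f] = \log\big(1 + (e^{\vartheta\Delta} - 1)\mathcal{F}[h]\big) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, (e^{\vartheta\Delta} - 1)^m\, \mathcal{F}[h]^m$$

as $\|(e^{\vartheta\Delta} - 1)\mathcal{F}[h]\|_\infty \le |e^{\vartheta\Delta} - 1| \le 1$ holds for $\vartheta\Delta \le \log 2$. We take the inverse Fourier transform of the equality to obtain the result.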
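The inversion can be verified pointwise in the Fourier domain. The following sketch is an illustration under assumptions: `w` stands for a hypothetical value of $\mathcal{F}[f]$ at one frequency, `theta_delta` for a value of $\vartheta\Delta$ below $\log 2$, and the compounding map uses the closed form $(e^{\vartheta\Delta w} - 1)/(e^{\vartheta\Delta} - 1)$. Applying the truncated logarithmic series to the compounded value should return `w`.

```python
import cmath
import math

def compound(w, theta_delta):
    """Value of F[h] at a frequency where F[f] equals w, with h = P_Delta[f]."""
    return (cmath.exp(theta_delta * w) - 1) / math.expm1(theta_delta)

def invert(z, theta_delta, terms=60):
    """Truncated series sum_{m>=1} (-1)^(m+1)/m * ((e^{theta*Delta}-1) z)^m / (theta*Delta)."""
    a = math.expm1(theta_delta)           # e^{theta*Delta} - 1, at most 1 when theta*Delta <= log 2
    total = 0j
    for m in range(1, terms + 1):
        total += (-1) ** (m + 1) / m * (a * z) ** m
    return total / theta_delta

theta_delta = 0.3                         # assumed value of theta * Delta, below log 2
w = 0.4 + 0.5j                            # assumed value of F[f] at some frequency, |w| <= 1
recovered = invert(compound(w, theta_delta), theta_delta)
print(abs(recovered - w))                 # should be at the level of rounding error
```

Keeping only the first $K + 1$ terms of the series, as the estimator does, leaves a remainder of order $(\vartheta\Delta)^{K+1}$ at each frequency.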